Voice profile: a structured probability model with application to voice morphing
نویسندگان
چکیده
This paper presents the concept of a voice profile as a complete description of the distributions of the acoustic correlates and the speaking characteristics of a speaker. A voice profile can be considered as a unified speakerdependent probability model of speech with applications in speaker identification, adaptive speech recognition, voice morphing and text to speech synthesis. The spectral and temporal parameters that define a voice profile are obtained from hidden Markov models (HMMs) of speech. The HMMs are trained on extended feature vectors that include features for recognition, synthesis and identification. A method of ranking the acoustic correlates of a speaker’s voice is proposed based on an analysis of the relative distance of each voice correlate from that of the gender-dependent modal voice. The voice profile is used effectively for voice conversion. Experimental results of speaker profiling and its evaluation in voice morphing are presented.
منابع مشابه
Analysis of a Modern Voice Morphing Approach using Gaussian Mixture Models for Laryngectomees
This paper proposes a voice morphing system for people suffering from Laryngectomy, which is the surgical removal of all or part of the larynx or the voice box, particularly performed in cases of laryngeal cancer. A primitive method of achieving voice morphing is by extracting the source's vocal coefficients and then converting them into the target speaker's vocal parameters. In this ...
متن کاملProbability models of formant parameters for voice conversion
This paper explores the estimation and mapping of probability models of formant parameter vectors for voice conversion. The formant parameter vectors consist of the frequency, bandwidth and intensity of resonance at formants. Formant parameters are derived from the coefficients of a linear prediction (LP) model of speech. The formant distributions are modelled with phonemedependent two-dimensio...
متن کاملAll Your Voices are Belong to Us: Stealing Voices to Fool Humans and Machines
In this paper, we study voice impersonation attacks to defeat humans and machines. Equipped with the current advancement in automated speech synthesis, our attacker can build a very close model of a victim’s voice after learning only a very limited number of samples in the victim’s voice (e.g., mined through the Internet, or recorded via physical proximity). Specifically, the attacker uses voic...
متن کاملVoice Morphing Using the Generative Topographic Mapping
In this paper we address the problem of Voice Morphing. We attempt to transform the spectral characteristics of a source speakers speech signal so that the listener would believe that the speech was uttered by a target speaker. The voice morphing system transforms the spectral envelope as represented by a Linear Prediction model. The transformation is achieved by codebook mapping using the Gen...
متن کاملAudio Morphing
Approach: There are two variants of our work: inter-voice morphing and intra-voice morphing. In the intra-voice morphing scenario, a single person’s voice is recorded uttering a wide range of utterances. The speaker’s phones are then morphed in time to generate new utterances of the speaker. We note that intra-voice morphing addresses the same problem that concatenative speech synthesis algorit...
متن کامل